AWB Database Scanner tutorial
Let's try to figure out how many total uncategorized images there are, using AutoWikiBrowser (AWB) on a wiki database dump!

Pre-steps
*Get AutoWikiBrowser if you don't already have it. (AWB has an info page on Wikipedia, and there is a how-to on Community Central.)
*Get the database dump for the wiki you wish to do this on. You don't need the version that includes the history. Database dumps are available at Special:Statistics. (This guide uses a dump for crashbandicoot/Bandipedia.)

Steps
# Open the AutoWikiBrowser program.
# Click en.wikipedia in the program's footer and set the wiki to the one you will be running the scan on. (Here, we've set it to wikia and crashbandicoot.)
# (Optional if you're just generating a list, probably required if you plan to actually make edits via AWB.) Click the red "User:" in the footer and log in with your account. You don't need any special permissions to use AWB on Wikia. (Though if you make a lot of semi-automated edits, it's recommended you have a separate account with a bot flag.)
# Choose "Database dump" as the source in the make list area, then click the "make list" button.
# The Database Scanner window will open.
# Browse for your database dump file. (I haven't included a picture of this because it's a pretty simple instruction and also I couldn't keep my real name out of the screenshot.)
# You might get a warning at this point. That's fine. Hit OK.
# Go to the namespace page and check the box next to the file namespace. You don't need to also check file talk. This makes the scanner look only in the file namespace.
# You can do one of a few things on this next step. Go to the "text" tab.
## If you'd like to find only files that are missing manually added categories (i.e. Category:Images typed onto the page itself, not added via a template), then select "not" and enter "Category:" in the text box, without checking any of the boxes below it.
## If you want to find images missing categories added by any means (manually or via template), you will need to use regular expressions. In this case, select "not" as above, check the "regex" checkbox (you don't have to worry about any of the new checkboxes that appear) and enter something like the following: Category|template|other-template|etc|ditto. In the case of crashbandicoot wiki, I used Category|delete|fairuse|badimage|game-screenshot (see Notes). That covers manually added categories and all the templates I could find that give categories on file pages. In regex, pipes (|) are an "or" operator, so this asks the scanner to look for the text "Category" OR "delete" OR "fairuse", etc., and to leave those pages out of the list because you've put the pattern in the "not" box.
# Press start. A list will be generated.
# You can alphabetize this list by clicking filter, making sure "sort alphabetically" is checked (and not changing anything else), then clicking apply.
# From here, you can export this list as a text file or a csv, with or without wiki markup, by clicking "save". (If you export as a text file with wiki markup, the text file will be a nice numbered list with links to each page, which you could paste onto the wiki you're running this check on.) Of course, you don't have to save the file. You can just take note of the number, or start editing from within AWB if you're comfortable and know how.

From here you can close the program once you've done whatever you decided to do.

Important: This will only list files that were included in the dump. Anything newer than the dump you're using will not appear in the generated list.

Notes
* The pattern Category|delete|fairuse|badimage|game-screenshot might be less nuanced than a more complex approach, but I didn't want to confuse users new to regex by including characters that need to be escaped.
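
If you'd like to cross-check the number AWB gives you, the namespace and "not" + regex steps above can be roughly sketched in Python. This is only an illustration of the idea, not how AWB works internally: AWB uses .NET regular expressions, while this uses Python's re module. The names uncategorized_files, local_name, CATEGORY_HINTS and SAMPLE are made up for this example. It assumes the dump is a standard MediaWiki XML export in which each page has an ns element (6 is the File namespace), and it assumes a case-insensitive match, which may not be what your AWB settings do.

```python
import io
import re
import xml.etree.ElementTree as ET

FILE_NS = "6"  # MediaWiki's numeric ID for the File namespace

# Same alternation as in the tutorial; "|" means "or".
# IGNORECASE is an assumption, not a confirmed AWB default.
CATEGORY_HINTS = re.compile(
    r"Category|delete|fairuse|badimage|game-screenshot", re.IGNORECASE
)


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def uncategorized_files(dump, pattern=CATEGORY_HINTS):
    """Yield titles of File pages whose text does NOT match pattern."""
    for _event, elem in ET.iterparse(dump, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, ns, text = None, None, ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "ns":
                ns = child.text
            elif name == "text":
                text = child.text or ""
        # The "not" box: keep the page only if the pattern is absent.
        if ns == FILE_NS and not pattern.search(text):
            yield title
        elem.clear()  # release the page's contents; dumps can be large


# A tiny hand-written stand-in for a real dump file.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>File:A.png</title><ns>6</ns>
    <revision><text>[[Category:Images]]</text></revision></page>
  <page><title>File:B.png</title><ns>6</ns>
    <revision><text>A plain description.</text></revision></page>
  <page><title>File:C.png</title><ns>6</ns>
    <revision><text>{{fairuse}}</text></revision></page>
  <page><title>Crash Bandicoot</title><ns>0</ns>
    <revision><text>Article text.</text></revision></page>
</mediawiki>"""

titles = sorted(uncategorized_files(io.BytesIO(SAMPLE)))
print(len(titles), titles)  # prints: 1 ['File:B.png']
```

To run it on a real dump, pass the path of the dump file to uncategorized_files instead of the in-memory sample. Passing re.compile("Category:") as the pattern mimics the simpler manual-categories-only option. As with AWB, a plain word like "delete" will also match ordinary description text, so expect the count to be approximate.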